1,840 research outputs found

    Investigating the Effect of Spatial Distribution and Spatiotemporal Information on Speciation using Individual-Based Ecosystem Simulation

    In this paper, we investigate the impact of species' spatial and spatiotemporal distribution information on speciation, using an individual-based ecosystem simulation (Ecosim). For this purpose, using machine learning techniques, we try to predict whether a species will split in the near future. Because of the imbalanced nature of our dataset, we use the SMOTE algorithm to build a relatively balanced dataset and avoid dismissing the minority class samples. Experimental results show very good predictions for the test set generated from the same run as the learning set. They also show good results on test sets generated from different runs of Ecosim. We also observe superior results when the learning set comes from a run with more species compared to a run with fewer species. Finally, we conclude that spatial and spatiotemporal information are very effective in predicting speciation.
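
The abstract above mentions SMOTE for rebalancing the dataset. As a rough illustration of the general idea only (not the authors' pipeline), the sketch below creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for this example.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples and
    one of their k nearest minority-class neighbours (basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D feature vectors for the rare "species will split" class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by that class. In practice a maintained library implementation would usually be preferable to a hand-written one.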

    Attention Visualizer Package: Revealing Word Importance for Deeper Insight into Encoder-Only Transformer Models

    This report introduces the Attention Visualizer package, which is designed to visually illustrate the significance of individual words in encoder-only transformer-based models. In contrast to other methods that center on tokens and self-attention scores, our approach examines the words and their impact on the final embedding representation. Libraries like this play a crucial role in enhancing the interpretability and explainability of neural networks. They offer the opportunity to illuminate the networks' internal mechanisms, providing a better understanding of how they operate and how they can be enhanced. You can access the code and review examples in the following GitHub repository: https://github.com/AlaFalaki/AttentionVisualizer. Comment: 12 pages, 15 figures

    Genome-scale approach proves that the lungfish-coelacanth sister group is the closest living relative of tetrapods with the BEST program

    The origin of tetrapods has not been resolved for decades. Three principal hypotheses (lungfish-tetrapod, coelacanth-tetrapod, or lungfish-coelacanth sister group) have been proposed. We used the Bayesian method under the coalescence model with the latest program (BEST) to perform a phylogenetic analysis for seven relevant taxa and 43 nuclear genes encoding amino acid residues, with the jackknife method for taxon sub-sampling. The results, combined with those of three other genome-scale approaches, successfully prove the hypothesis that lungfishes and coelacanths form a monophyletic sister group and are equally related to tetrapods, supported by high Bayesian posterior probabilities of the branch (a lungfish-coelacanth clade) and high taxon jackknife supports.

    An individual-based evolving predator-prey ecosystem simulation using a fuzzy cognitive map as the behavior model

    This paper presents an individual-based predator-prey model in which, for the first time, each agent's behavior is modeled by a Fuzzy Cognitive Map (FCM), allowing the agent behavior to evolve through the epochs of the simulation. The FCM enables the agent to evaluate its environment (e.g., distance to predator/prey, distance to potential breeding partner, distance to food, energy level) and its internal state (e.g., fear, hunger, curiosity) with memory, and to choose among several possible actions such as evasion, eating or breeding. The FCM of each individual is unique and is the outcome of the evolution process throughout the simulation. The notion of species is also implemented in a way that species emerge from the evolving population of agents. To our knowledge, our system is the only one that allows modeling the links between behavior patterns and speciation. The simulation produces a large amount of data, including: number of individuals, level of energy by individual, choice of action, age of the individuals, average FCM associated with each species, number of species. This study investigates patterns of macroevolutionary processes such as the emergence of species in a simulated ecosystem and proposes a general framework for the study of specific ecological problems such as invasive species and species diversity patterns. We present promising results showing coherent behaviors of the whole simulation with the emergence of strong correlation patterns also observed in existing ecosystems.
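
To make the idea of an FCM-driven agent more concrete, here is a minimal sketch of one common FCM update rule: each concept's new activation is a sigmoid of the weighted sum of the activations feeding into it. The abstract does not give the paper's actual update rule, concept set, or weights, so the concept names (`predator_close`, `fear`, `evasion`), the weight values, and the `clamped` mechanism for sensory inputs are all assumptions for illustration. Evolution of the map and memory are not modeled here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fcm_step(activations, weights, clamped=()):
    """One synchronous FCM update.

    weights[src][dst] is the causal influence of concept src on concept dst.
    Concepts listed in `clamped` (e.g. sensory inputs) keep their value.
    """
    updated = {}
    for dst, current in activations.items():
        if dst in clamped:
            updated[dst] = current
            continue
        total = sum(
            activations[src] * out.get(dst, 0.0) for src, out in weights.items()
        )
        updated[dst] = sigmoid(total)
    return updated


# Hypothetical three-concept map: perception -> internal state -> action.
WEIGHTS = {"predator_close": {"fear": 2.0}, "fear": {"evasion": 2.0}}


def run(predator_close, steps=2):
    state = {"predator_close": predator_close, "fear": 0.0, "evasion": 0.0}
    for _ in range(steps):
        state = fcm_step(state, WEIGHTS, clamped={"predator_close"})
    return state


threatened = run(1.0)  # predator nearby
safe = run(0.0)        # no predator perceived
```

After two steps, the agent that perceives a predator has a higher `evasion` activation than the one that does not, which is the kind of perception-to-action chain the abstract describes. In an evolving simulation, the entries of `WEIGHTS` would be the heritable quantities.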

    The use of Depletion methods to assess Mediterranean cephalopod stocks under the current EU Data Collection Framework

    Fuelled by the increasing importance of cephalopod fisheries in Europe, scientists and stakeholders have demanded their assessment and management. However, little has been done to improve the data collection under the EU Data Collection Framework (DCF) in order to analyse cephalopod populations. While the DCF allows member states to design flexible national sampling programmes, it establishes the minimum data requirements (MDR) each state is obliged to fulfil. This study was performed to investigate whether the MDR currently set by the DCF allow the application of depletion models (DMs) to assess European cephalopod stocks. Squid and cuttlefish fisheries from the western Mediterranean were used as a case study. This study casts doubt on the suitability of the MDR to properly assess and manage cephalopod stocks by means of DMs. Owing to the high plasticity of life-history traits in cephalopod populations, biological parameters should be estimated during the actual depletion period of the fished stocks, rather than through triennial sampling as established by the DCF. In order to accurately track the depletion event, the rapid growth rates of cephalopods imply that their populations should be monitored at shorter time scales (ideally weekly or biweekly) instead of quarterly as specified by the DCF. These measures would not require additional resources beyond the ongoing DCF but a redistribution of sampling efforts during the depletion period. Such changes in the sampling scheme could be designed and undertaken by the member states or directly integrated as requirements. This study was performed under the Data Collection Framework (cofunded by the EU and the Spanish Institute of Oceanography, IEO) and the CONFLICT project (CGL2008-958; funded by the Spanish Ministry of Science and Innovation). SK was financed by an IEO-FPI grant (2011/11). Peer reviewed.
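
The abstract does not say which depletion model the study applied. As one classical example, the sketch below fits a Leslie-type depletion model, which assumes a closed population in which catch per unit effort (CPUE) declines linearly with the cumulative catch already removed: CPUE_t = q * (N0 - K_t), where q is catchability, N0 the initial population size, and K_t the catch taken before period t. The function name and the weekly data are invented for illustration.

```python
def leslie_depletion(catches, efforts):
    """Estimate catchability q and initial abundance N0 by ordinary least
    squares of CPUE on cumulative prior catch (Leslie-type model)."""
    cpue = [c / e for c, e in zip(catches, efforts)]
    # Cumulative catch removed before each period starts.
    cumulative = []
    total = 0.0
    for c in catches:
        cumulative.append(total)
        total += c
    n = len(cpue)
    mean_x = sum(cumulative) / n
    mean_y = sum(cpue) / n
    sxx = sum((x - mean_x) ** 2 for x in cumulative)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cumulative, cpue))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    q = -slope            # CPUE falls by q per individual removed
    n0 = intercept / q    # intercept equals q * N0
    return q, n0


# Synthetic weekly data generated from known values (q = 0.001, N0 = 1000).
true_q, true_n0 = 0.001, 1000.0
efforts = [100.0] * 6
catches = []
remaining = true_n0
for effort in efforts:
    catch = true_q * effort * remaining
    catches.append(catch)
    remaining -= catch

q_hat, n0_hat = leslie_depletion(catches, efforts)
```

On this noise-free synthetic series the regression recovers the generating values. The example also hints at why sampling frequency matters: with only a handful of coarse time points during the depletion period, the slope of the regression is poorly constrained.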

    Conserved transcription factor binding sites of cancer markers derived from primary lung adenocarcinoma microarrays

    Gene transcription in a set of 49 human primary lung adenocarcinomas and 9 normal lung tissue samples was examined using Affymetrix GeneChip technology. A total of 3442 genes, called the set M_AD, were found to be either up- or down-regulated by at least 2-fold between the two phenotypes. Genes assigned to a particular gene ontology term were found, in many cases, to be significantly unevenly distributed between the genes in and outside M_AD. Terms that were overrepresented in M_AD included functions directly implicated in cancer cell metabolism. Based on their functional roles and expression profiles, genes in M_AD were grouped into likely co-regulated gene sets. Highly conserved sequences in the 5 kb region upstream of the genes in these sets were identified with the motif discovery tool MoDEL. Potential oncogenic transcription factors and their corresponding binding sites were identified in these conserved regions using the TRANSFAC 8.3 database. Several of the transcription factors identified in this study have been shown elsewhere to be involved in oncogenic processes. This study searched beyond phenotypic gene expression profiles in cancer cells, in order to identify the more important regulatory transcription factors that caused these aberrations in gene expression.

    Cooperative Metaheuristics for Exploring Proteomic Data

    Most combinatorial optimization problems cannot be solved exactly. A class of methods, called metaheuristics, has proved efficient at giving good approximate solutions in a reasonable time. Cooperative metaheuristics are a subset of metaheuristics that imply a parallel exploration of the search space by several entities with information exchange between them. The importance of information exchange in the optimization process is related to the building block hypothesis of evolutionary algorithms, which is based on these two questions: what is the pertinent information of a given potential solution, and how can this information be shared? A classification of cooperative metaheuristic methods depending on the nature of the cooperation involved is presented, and the specific properties of each class, as well as a way to combine them, are discussed. Several improvements in the field of metaheuristics are also given. In particular, a method to regulate the use of classical genetic operators and to define new, more pertinent ones is proposed, taking advantage of a building-block structured representation of the explored space. A hierarchical approach resting on multiple levels of cooperative metaheuristics is finally presented, leading to the definition of a complete concerted cooperation strategy. Some applications of these concepts to difficult proteomics problems, including automatic protein identification, biological motif inference and multiple sequence alignment, are presented. For each application, an innovative method based on the cooperation concept is given and compared with classical approaches. In the protein identification problem, a first level of cooperation using swarm intelligence is applied to the comparison of mass spectrometric data with a biological sequence database, followed by a genetic programming method to discover an optimal scoring function. The multiple sequence alignment problem is decomposed into three steps involving several evolutionary processes to infer different kinds of biological motifs and a concerted cooperation strategy to build the sequence alignment according to their motif content.

    Optimization and performance testing of a sequence processing pipeline applied to detection of nonindigenous species

    Get PDF
    Genetic taxonomic assignment can be more sensitive than morphological taxonomic assignment, particularly for small, cryptic or rare species. Sequence processing is essential to taxonomic assignment, but can also produce errors because optimal parameters are not known a priori. Here, we explored how sequence processing parameters influence taxonomic assignment of 18S sequences from bulk zooplankton samples produced by 454 pyrosequencing. We optimized a sequence processing pipeline for two common research goals, estimation of species richness and early detection of aquatic invasive species (AIS), and then tested the performance of the optimal models through simulations. We tested 1,050 parameter sets on 18S sequences from 20 AIS to determine optimal parameters for each research goal. We tested the performance of the optimized pipelines (detectability and sensitivity) by computationally inoculating sequences of 20 AIS into ten bulk zooplankton samples from ports across Canada. We found that optimal parameter selection generally depends on the research goal. However, regardless of research goal, we found that metazoan 18S sequences produced by 454 pyrosequencing should be trimmed to 375–400 bp and sequence quality filtering should be relaxed (1.5 ≤ maximum expected error ≤ 3.0, Phred score = 10). Clustering and denoising were only viable for estimating species richness, because these processing steps made some species undetectable at low sequence abundances, which would not be useful for early detection of AIS. With parameter sets optimized for early detection of AIS, 90% of AIS were detected with fewer than 11 target sequences, regardless of whether clustering or denoising was used. Despite developments in next-generation sequencing, sequence processing remains an important issue owing to difficulties in balancing false-positive and false-negative errors in metabarcoding data.
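
The "maximum expected error" threshold mentioned above is commonly computed from Phred quality scores: a base with score Q has error probability 10^(-Q/10), and a read's expected number of errors is the sum of those probabilities. The sketch below illustrates trimming to a fixed length and filtering on that quantity. It is not the authors' pipeline; the function names, the default values (taken from the ranges quoted in the abstract), and the choice to discard reads shorter than the trim length are assumptions.

```python
def expected_errors(phred_scores):
    """Expected number of base-call errors in a read, from Phred scores."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def trim_and_filter(seq, quals, trim_len=400, max_ee=3.0):
    """Truncate a read to trim_len bases and keep it only if its expected
    error count does not exceed max_ee. Returns the trimmed sequence or None.

    Assumption: reads shorter than trim_len are discarded.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if len(seq) < trim_len:
        return None
    seq, quals = seq[:trim_len], quals[:trim_len]
    if expected_errors(quals) > max_ee:
        return None
    return seq


# Toy reads: 450 bases each, with uniformly high or uniformly low quality.
read = "ACGTA" * 90
kept = trim_and_filter(read, [30] * 450)      # expected errors: 400 * 0.001
rejected = trim_and_filter(read, [10] * 450)  # expected errors: 400 * 0.1
```

A relaxed threshold such as `max_ee=3.0` retains more low-abundance reads at the cost of more sequencing errors, which is the false-negative versus false-positive trade-off the abstract closes on.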